Should We Redesign Forecasting Competitions?
Author
Abstract
The M3-Competition continues to improve the design of forecasting competitions: it examines more series than any previous competition, improves error analyses, and includes commercial forecasting programs as competitors. To judge where to go from here, I step back to look at the M-Competitions as a whole. I discuss the advantages of the M-Competitions in hopes that they will be retained, describe how to gain additional benefit from future competitions, and finally, describe a low-cost approach to competitions.

Comments: Postprint version. Published in International Journal of Forecasting, Volume 17, 2001, pages 542-545. This journal article is available at ScholarlyCommons: http://repository.upenn.edu/marketing_papers/56

J. Scott Armstrong, The Wharton School, University of Pennsylvania

1. Favorable design aspects of the M-Competitions

The M-Competitions provide a model for conducting scientific research. They employ at least five key aspects: empirical testing, multiple hypotheses, large samples, independent validation, and full disclosure. While these aspects might seem obvious, studies in management science seldom include all of them.

1.1. Empirical testing

Empirical testing is necessary to test forecasting methods. Despite the resistance of time-series researchers (Fildes & Makridakis, 1995), interest in empirical studies has been growing among forecasters. Forecasting journals now publish many empirical comparisons. The M-Competitions have led the way in such comparisons.

1.2. Multiple hypotheses

Academic researchers rely heavily upon advocacy (Armstrong, Brodie, & Parsons, 2001b); they develop what they believe to be the best method (hypothesis), then seek information to support it. It is uncommon in management science for a researcher to examine competing hypotheses. However, nearly 60% of empirical papers in the Journal of Forecasting and the International Journal of Forecasting tested competing hypotheses (Armstrong, 1989). The M-Competitions have been exemplary in providing open calls, thus allowing those with different approaches to participate.

1.3. Large samples

Testing should be done on large samples. However, many academic studies, including those in forecasting, do not use large samples. You need only pick up the latest copies of journals to observe this. The M-Competitions were a departure from this norm. The first forecasting competition (Makridakis & Hibon, 1979) examined 111 series (considered large at the time) and the M-Competition (Makridakis et al., 1982) examined 1001.

1.4. Independent validation on a common data base

The methods in the competitions were tested on a common holdout database by a researcher who examined the accuracy of forecasts submitted by the competitors. This testing procedure avoided problems inherent in drawing conclusions from prior research in which databases are different.

1.5. Full disclosure

Full disclosure is important to allow others to conduct replications and extensions. Despite a consensus among researchers that replication is vital in advancing scientific knowledge, the number of published replications in the management sciences is negligible, and there are few
Related resources
Should We Redesign Forecasting Competitions? 1.4. Independent Validation on a Common Data Base
The M3-Competition continues to improve the design of forecasting competitions: it examines more series than any previous competition, improves error analyses, and includes commercial forecasting programs as competitors. To judge where to go from here, I step back to look at the M-Competitions as a whole. I discuss the advantages of the M-Competitions in hopes that they will be retained, descri...
The value of feedback in forecasting competitions
In this paper we challenge the traditional design used for forecasting competitions. We implement an online competition with a public leaderboard that provides instant feedback to competitors who are allowed to revise and resubmit forecasts. The results show that feedback significantly improves forecasting accuracy.
The Impact of Empirical Accuracy Studies on Time Series Analysis and Forecasting
Social scientists envy the objectivity, controlled experimentation and replicability of hard sciences, a lack of which, they claim, hampers their ability to advance their disciplines and make them more useful and relevant to real life applications. This paper examines a specific area of social science, time series forecasting, which, through empirical studies using real-life data, allows for obj...
Automatic forecasting with a modified exponential smoothing state space framework
A new automatic forecasting procedure is proposed based on a recent exponential smoothing framework which incorporates a Box-Cox transformation and ARMA residual corrections. The procedure is complete with well-defined methods for initialization, estimation, likelihood evaluation, and analytical derivation of point and interval predictions under a Gaussian error assumption. The algorithm is exa...
Boosting multi-step autoregressive forecasts
Multi-step forecasts can be produced recursively by iterating a one-step model, or directly using a specific model for each horizon. Choosing between these two strategies is not an easy task since it involves a trade-off between bias and estimation variance over the forecast horizon. Using a nonlinear machine learning model makes the tradeoff even more difficult. To address this issue, we propo...